feat: add audit table for data migration scripts #4095

Merged: 2 commits, Dec 19, 2024

jessicamcinchak (Member) commented Dec 19, 2024

Following on from our data migration chat! This is a basic table structure that will help us both queue up the flows a given data migration script should update and track which have been updated so far.

Only the platformAdmin role can access and update this table.

I'm imagining:

  • We'll simply manually populate the table like so:
INSERT INTO temp_data_migrations_audit (flow_id, team_id)
SELECT id, team_id
FROM flows 
WHERE team_id IN (1,2,3); -- team_ids match councils that have "consented"
  • Then kick off the migration script, which will fetch, transform, and update one flow (row) at a time where updated = false, until every record has been updated
  • Finally, after a successful migration, we can TRUNCATE the table and reuse the same structure for our next use case (with the option to export it as a CSV "receipt" and store it on gdrive before truncating, if necessary)
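
The three steps above can be sketched end to end. This is only an illustration under stated assumptions, not the actual migration script: an in-memory SQLite database stands in for the real Postgres/Hasura setup, `transform` is a hypothetical placeholder for the real migration logic, the simplified `flows` columns and the default value of `updated` are assumed, and `DELETE` stands in for `TRUNCATE` because SQLite has no such statement.

```python
import csv
import io
import sqlite3

# In-memory SQLite stands in for the real Postgres database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flows (id TEXT PRIMARY KEY, team_id INTEGER NOT NULL, data TEXT);
    -- Column names follow the PR; the default for `updated` is an assumption.
    CREATE TABLE temp_data_migrations_audit (
        flow_id TEXT PRIMARY KEY,
        team_id INTEGER NOT NULL,
        updated INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO flows VALUES ('a', 1, 'old'), ('b', 2, 'old'), ('c', 9, 'old');
""")

# Step 1: queue every flow owned by a consenting team.
conn.execute("""
    INSERT INTO temp_data_migrations_audit (flow_id, team_id)
    SELECT id, team_id FROM flows WHERE team_id IN (1, 2, 3)
""")


def transform(data):
    """Hypothetical placeholder for the real migration logic."""
    return data.replace("old", "new")


def migrate_next(conn):
    """Migrate one queued flow; return False once the queue is empty."""
    row = conn.execute(
        "SELECT flow_id FROM temp_data_migrations_audit WHERE updated = 0 LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    flow_id = row[0]
    (data,) = conn.execute(
        "SELECT data FROM flows WHERE id = ?", (flow_id,)
    ).fetchone()
    # Update the flow and mark it as done in a single transaction.
    with conn:
        conn.execute(
            "UPDATE flows SET data = ? WHERE id = ?", (transform(data), flow_id)
        )
        conn.execute(
            "UPDATE temp_data_migrations_audit SET updated = 1 WHERE flow_id = ?",
            (flow_id,),
        )
    return True


# Step 2: process one row at a time until nothing is left to update.
migrated = 0
while migrate_next(conn):
    migrated += 1

# Step 3: export a CSV "receipt", then clear the table for reuse.
receipt = io.StringIO()
writer = csv.writer(receipt)
writer.writerow(["flow_id", "team_id", "updated"])
writer.writerows(
    conn.execute(
        "SELECT flow_id, team_id, updated "
        "FROM temp_data_migrations_audit ORDER BY flow_id"
    )
)
with conn:
    conn.execute("DELETE FROM temp_data_migrations_audit")  # stand-in for TRUNCATE
remaining = conn.execute(
    "SELECT COUNT(*) FROM temp_data_migrations_audit"
).fetchone()[0]
```

Marking the row as updated in the same transaction as the flow update means an interrupted run can be restarted safely: any row still at updated = false has not been migrated yet.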

github-actions bot commented Dec 19, 2024

🤖 Hasura Change Summary compared a subset of table metadata including permissions:

Tracked Tables (1)

@jessicamcinchak jessicamcinchak requested a review from a team December 19, 2024 15:57
Comment on lines +2 to +3
"flow_id" uuid NOT NULL,
"team_id" integer NOT NULL,
Contributor

Would relationships here be helpful for queries in the migration scripts?

Member Author

I don't think so! As I see it, we'll simply return the flow_id from this table (it's the primary key, so it's indexed), then pass it into a GraphQL relational query that fetches the flow and its latest published version (so the meaningful foreign key sits at the GraphQL and existing flows table level rather than on this one?). But please object if I'm missing a benefit of fkeys directly here!

Member Author (@jessicamcinchak, Dec 19, 2024)

Oof, slow brain today - answered my own question there - you're totally right: if we set a proper relationship to flows here, then the migration script will be able to make a single query rather than two 👍 Update incoming!

Member Author

query MyQuery {
  temp_data_migrations_audit(where: {updated: {_eq: false}}, limit: 1) {
    flow_id
    flows {
      data
      published_flows(order_by: {created_at: desc}, limit: 1) {
        data
      }
    }
  }
}
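
For readers who think in SQL, the query above corresponds roughly to a single join from the audit table to flows, plus a lookup of the most recent published version. The sketch below is a rough equivalent under assumptions: SQLite stands in for Postgres, the `flow_id` column on `published_flows` and the sample rows are hypothetical, and only the columns named in the GraphQL query are modelled.

```python
import sqlite3

# In-memory SQLite stands in for the real Postgres database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flows (id TEXT PRIMARY KEY, data TEXT);
    -- The flow_id column name on published_flows is an assumption.
    CREATE TABLE published_flows (
        id INTEGER PRIMARY KEY,
        flow_id TEXT NOT NULL REFERENCES flows (id),
        data TEXT,
        created_at TEXT NOT NULL
    );
    -- The foreign key to flows is what lets one query reach the flow data.
    CREATE TABLE temp_data_migrations_audit (
        flow_id TEXT PRIMARY KEY REFERENCES flows (id),
        team_id INTEGER NOT NULL,
        updated INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO flows VALUES ('a', 'draft-a');
    INSERT INTO published_flows (flow_id, data, created_at) VALUES
        ('a', 'published-v1', '2024-12-01'),
        ('a', 'published-v2', '2024-12-15');
    INSERT INTO temp_data_migrations_audit (flow_id, team_id) VALUES ('a', 1);
""")

# One round trip: next pending audit row, its flow, and the latest published data.
row = conn.execute("""
    SELECT a.flow_id,
           f.data AS flow_data,
           (SELECT p.data
              FROM published_flows AS p
             WHERE p.flow_id = f.id
             ORDER BY p.created_at DESC
             LIMIT 1) AS latest_published_data
      FROM temp_data_migrations_audit AS a
      JOIN flows AS f ON f.id = a.flow_id
     WHERE a.updated = 0
     LIMIT 1
""").fetchone()
```

Without the relationship, the script would first read `flow_id` from the audit table and then issue a second query against flows, which is the two-query version discussed earlier in this thread.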

github-actions bot commented Dec 19, 2024

Removed vultr server and associated DNS entries

@jessicamcinchak jessicamcinchak merged commit 14dfb4a into main Dec 19, 2024
12 checks passed
@jessicamcinchak jessicamcinchak deleted the jess/data-migrations-audit-table branch December 19, 2024 16:41